1.  Summary

We characterize the sets of positive states and null states for nonsingular
Markov processes and, more generally, for positive contractions in $L_1$. The set
$P$ of positive states is an invariant set and carries all finite invariant measures
which are absolutely continuous with respect to a given measure $\mu$, the initial
distribution. The Cesàro averages of the "probabilities of being in $B$ at time $n$"
converge to a positive limit for any subset $B$ of $P$ with $\mu(B) > 0$. The set $N$ of
null states is a countable union of sets $X_i$ with the property that the Cesàro
averages of the "probabilities of being in $X_i$" tend to 0 for each $X_i$. We further
generalize Hopf's decomposition of the state space into a conservative and
dissipative part by introducing monotonically decreasing weights, obtaining the
positive part $P$ as a special "weighted conservative part" with divergent sum of
weights. As an application we derive an ergodic theorem with appropriate
weighted averages under conditions which do not imply the usual ergodic
theorem (corollary 2).

Different  characterizations  of  the  decomposition  into  P  and  N  have  been 
described  by  Mrs.  Dowker  [7]  (for  point  mappings)  and  by  Neveu  [20].  (See 
also  Neveu’s  paper  of  this  Berkeley  Symposium.  I  noticed  the  decomposition 
independently,  but  later  than  Neveu.  Also  A.  Hajian  and  Y.  Ito  have  some 
related  (so  far  unpublished)  results,  which  overlap  with  Neveu’s  present  paper 
and  are  based  on  his  paper  [20].)  I  am  indebted  to  Professors  D.  Freedman, 
Y.  Ito,  and  W.  Pruitt  for  some  references. 

2.  Introduction 

Let $(X, \mathcal{F}, \mu)$ be a measure space with $\mu(X) = 1$. All sets and functions
introduced are assumed to be measurable. Sets as well as functions are identified if
they coincide almost everywhere. Let $T$ be a positive contraction in $L_1 =
L_1(X, \mathcal{F}, \mu)$, that is, a linear operator in $L_1$ with $Tf \ge 0$ for all $0 \le f \in L_1$ and
with $\|T\| = \sup_{\|f\| = 1} \|Tf\| \le 1$. By the Radon-Nikodym theorem, $L_1$ is isomorphic
to the Banach space $\Phi$ of all signed measures $\varphi$ which are absolutely continuous
with respect to $\mu$: $\varphi \ll \mu$. The contraction $T$ induces in $\Phi$ an isomorphic
operator $\Lambda$ defined by

(1)  $(\Lambda\varphi)(A) = \int_A T(d\varphi/d\mu)\,d\mu, \quad (\varphi \in \Phi).$

Conversely, $\Lambda$ may be given first and $T$ defined by $\varphi_f(A) = \int_A f\,d\mu$ and $Tf =
d\Lambda\varphi_f/d\mu$. This is the case if $\Lambda$ is given by a stochastic kernel $P(x, A)$ by the
relation

(2)  $(\Lambda\varphi)(A) = \int_X P(x, A)\,d\varphi, \quad (\varphi \in \Phi,\ A \in \mathcal{F})$

where $P(x, A)$ is nonsingular, that is, $\mu(A) = 0$ implies $P(x, A) = 0$.

Let $A^c$ be the complement of $A$. Functions $f$ with $Tf = f$ and measures $\varphi$ with
$\Lambda\varphi = \varphi$ are called invariant. A set $I \in \mathcal{F}$ is called invariant if $Tf = 0$ on $I^c$ for
any $f \ge 0$ with $f = 0$ on $I^c$, or, equivalently, if $\Lambda\varphi(I^c) = 0$ for any $\varphi \ge 0$ with
$\varphi(I^c) = 0$.
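
For a finite state space the passage between $\Lambda$ and $T$ can be made concrete. The following is a minimal sketch, not taken from the paper: the three-state matrix `P`, the weights `mu`, and the helper names `apply_Lambda`, `apply_T`, and `l1_norm` are assumptions chosen only for illustration. A measure is stored as a list of point masses, `apply_Lambda` is relation (2) for a stochastic matrix, and `apply_T` is obtained from it through densities with respect to `mu`.

```python
# Hypothetical 3-state example; P, mu and all function names are illustrative.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]          # row-stochastic kernel P(x, {y})
mu = [0.2, 0.3, 0.5]           # reference measure, positive on every point


def apply_Lambda(phi):
    """(Lambda phi)(y) = sum_x phi(x) P(x, {y}): relation (2) on a finite space."""
    n = len(phi)
    return [sum(phi[x] * P[x][y] for x in range(n)) for y in range(n)]


def apply_T(f):
    """T f = d(Lambda phi_f)/d mu, where phi_f has density f with respect to mu."""
    phi_f = [f[x] * mu[x] for x in range(len(f))]
    return [mass / weight for mass, weight in zip(apply_Lambda(phi_f), mu)]


def l1_norm(f):
    """Norm of f in L_1(mu)."""
    return sum(abs(value) * weight for value, weight in zip(f, mu))


f = [1.0, 2.0, 0.5]
Tf = apply_T(f)
```

In this sketch `Tf` is nonnegative and has the same $L_1(\mu)$ norm as `f`, because a stochastic matrix preserves total mass; a substochastic matrix would give a strict contraction instead.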

Our  main  result  will  be  derived  from  the  following  generalization  of  a  theorem 
of  Y.  Ito  [15],  which  was  obtained  independently  by  Dean  and  Sucheston  [6] 
and  by  Neveu  [20]. 

Theorem A. The following conditions are equivalent:

(i) there exists a strictly positive invariant function $f \in L_1$;

(ii) $\inf_n \Lambda^n\mu(A) > 0$ for all $A \in \mathcal{F}$ with $\mu(A) > 0$;

(iii) $\lim_{n\to\infty} \bigl(\sup_j n^{-1} \sum_{i=0}^{n-1} \Lambda^{i+j}\mu(A)\bigr) > 0$ for all $A \in \mathcal{F}$ with $\mu(A) > 0$.

Note that (i) is equivalent to the existence of a finite invariant measure $\varphi \ge 0$
with $\mu \ll \varphi \ll \mu$. For references concerning the existence of invariant measures
see [22], [15], [10], [16]. Some more conditions are described in a paper by
Hajian and Ito [9].

The  following  theorem  is  essentially  due  to  Hopf  [12]. 

Theorem B. The space $X$ is the disjoint union of two uniquely determined sets
$C$ and $D$, respectively the conservative part and the dissipative part of $X$, such that

(i) for every $f \ge 0$, $\sum_{k=0}^{\infty} T^k f$ converges on $D$;

(ii) for every $f \ge 0$, $\sum_{k=0}^{\infty} T^k f$ diverges on $\{x: \sum_{k=0}^{\infty} T^k f > 0\} \cap C$;

(iii) $C$ is invariant.

Chacon  and  Ornstein  [4]  proved  the  following  theorem,  which  was  conjectured 
by  Hopf.  This  author  also  gave  a  simplified  proof  later  [12],  [13]. 

Theorem C. For every $f \in L_1$ and $0 \le p \in L_1$ the limit

(3)  $\lim_{n\to\infty} \dfrac{\sum_{k=0}^{n-1} T^k f}{\sum_{k=0}^{n-1} T^k p} = h(f, p)$

exists and is finite on $\{x: \sum_{k=0}^{\infty} T^k p > 0\}$.

Let $\chi_A$ be the characteristic function of $A$. Define $T_C$ and $T_D$ by $T_C f = \chi_C Tf$,
$T_D f = \chi_D Tf$ for $f \in L_1$. Then

(4)  $R_C f = \chi_C f + T_C(\chi_D f) + \sum_{k=1}^{\infty} T_C T_D^k(\chi_D f)$

defines a positive contraction. Chacon [2], [3] proved the following theorem.

Theorem D. The invariant subsets of $C$ form a $\sigma$-field $\mathfrak{I}$. For every $0 \le p \in L_1$,

(i) the function $h(f, p) \cdot p$ is integrable;

(ii) the equality $h(f, p) = h(R_C f, R_C p) = E(R_C f \mid \mathfrak{I}) / E(R_C p \mid \mathfrak{I})$ holds on
$C \cap \{x: \sum_{k=0}^{\infty} T^k p > 0\}$.

3.  The  positive  part  and  the  null  part  of  X 

For $0 \le f \in L_1$ with $Tf = f$, let $P(f) = \{f > 0\}$. It is easy to see that there
is a maximal set $P$ among the sets $P(f)$. We will obtain $P$ by a different approach
and characterize $P$ in probabilistic terms.

Theorem 1. The space $X$ is the disjoint union of two uniquely determined sets
$P$ and $N$, respectively the positive part and the null part of $X$, such that

(i) $P$ is an invariant subset of $C$;

(ii) there exists an invariant $0 \le \bar f \in L_1$ which is strictly positive in $P$;

(iii) for any $\varphi \in \Phi$ and $A \in \mathcal{F}$ the limit

(5)  $\lim_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} \Lambda^k\varphi(A \cap P) = \lambda_\varphi(A)$

exists ($\lambda_\varphi \in \Phi$ is invariant). Let $f = d\varphi/d\mu$; then we have

(6)  $\lambda_\varphi(A) = \int_{A \cap P} \bar f\, E(R_C f \mid \mathfrak{I})\, E(\bar f \mid \mathfrak{I})^{-1}\, d\mu.$

Thus, if $f \ge 0$ and $\int_{A \cap P} f\, d\mu > 0$, then $\lambda_\varphi(A) > 0$;

(iv) $N = X - P$ is a countable union of sets $X_i$, $i = 1, 2, \cdots$ such that

(7)  $\lim_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} \Lambda^k\varphi(X_i \cap A) = 0$

holds for any $A \in \mathcal{F}$, $\varphi \in \Phi$ and $i = 1, 2, \cdots$.

Proof. We consider $(X, \mathcal{F}, \mu)$ as a measure algebra. A real-valued function
$H$ on $\mathcal{F}$ is called monotonic if $B \subset A$ implies $H(B) \le H(A)$. The construction of
the decomposition is based on the following simple lemma.

Lemma 1. If $H$ is a nonnegative monotonic function on $\mathcal{F}$ with $H(\emptyset) = 0$, then
$X$ is the disjoint union of two uniquely determined sets $P$ and $N$ such that

(i) $H(B) > 0$ holds for all $\emptyset \ne B \subset P$ and

(ii) $N$ is the disjoint union of countably many sets $X_i$ with $H(X_i) = 0$.

Proof. Measure algebras are closed with respect to the formation of arbitrary
unions. (Such unions may always be replaced by countable unions.) Let
$N$ be the union of all sets $B \in \mathcal{F}$ with $H(B) = 0$. Passing to subsets we may
assume $N = \bigcup_{i=1}^{\infty} X_i$ with disjoint $X_i$ and $H(X_i) = 0$. Let $P = X - N$. Then
(i), (ii), and the uniqueness are obvious.

For any bounded sequence $\{x_n\}$ of real numbers let

(8)  $M\{x_n\} = \lim_{n\to\infty} \bigl(\sup_j n^{-1} \sum_{i=0}^{n-1} x_{i+j}\bigr)$

(this $M$ is the maximal value of Banach limits; see [6], [22]). The function
$H(B) = M\{\Lambda^n\mu(B)\}$ is monotonic. Further, $P$ and $N$ are characterized by the
conditions

(9)  $M\{\Lambda^n\mu(B)\} > 0$ for every $B \subset P$ with $\mu(B) > 0$,

(10)  $N$ is the disjoint union of countably many sets $X_i$, $i = 1, 2, \cdots$, where $M\{\Lambda^n\mu(X_i)\} = 0$.

Before proceeding with the proof, we will collect some facts about $M$. It is
known and not difficult to prove that

(11)  $M\{x_n + y_n\} \le M\{x_n\} + M\{y_n\}$ for every $\{x_n\}$, $\{y_n\}$,

(12)  $M\{a x_n\} = a M\{x_n\}$ for $a \ge 0$,

(13)  $M\{x_n\} \le \sup_n |x_n|$.

We will further need the equation

(14)  $M\{x_n + y_n\} = M\{x_n + y_{n+1}\}$ for every $\{x_n\}$, $\{y_n\} \in \ell_\infty$,

which follows by an easy cancellation argument.

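With the above, the proof of (iv) is immediate: we may assume $\varphi \ge 0$ and
$A = X$. Let $f = d\varphi/d\mu$, $f_n = \min\{f, n\}$, $g_n = f - f_n$, and let $\varphi_n$ and $\psi_n$ be
respectively the measures with $f_n = d\varphi_n/d\mu$ and $g_n = d\psi_n/d\mu$. For every $\varepsilon > 0$,
$\|\psi_{n_0}\| = \|g_{n_0}\| < \varepsilon$ for sufficiently large $n_0$. For every $n$,

(15)  $|\Lambda^n\varphi(X_i)| \le |\Lambda^n\varphi_{n_0}(X_i)| + |\Lambda^n\psi_{n_0}(X_i)| \le n_0 |\Lambda^n\mu(X_i)| + \varepsilon.$

Therefore,

(16)  $M\{\Lambda^n\varphi(X_i)\} \le n_0 M\{\Lambda^n\mu(X_i)\} + \varepsilon = \varepsilon.$

Since $\varepsilon > 0$ was arbitrary, this proves $M\{\Lambda^n\varphi(X_i)\} = 0$, a statement which is
slightly stronger than (iv), since

(17)  $M\{x_n\} \ge \limsup_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} x_k.$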
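
The gap between the two sides of (17) can be seen on a concrete 0-1 sequence. The sketch below is only an illustration under stated assumptions: it takes $M$ to be the usual maximal Banach limit, $\lim_n \sup_j n^{-1}\sum_{i<n} x_{i+j}$, and it replaces the supremum over all $j$ by a maximum over the windows that fit into a finite truncation. The sequence and the function names `cesaro_mean` and `max_window_mean` are hypothetical.

```python
def cesaro_mean(x):
    """Plain average of the finite sequence x."""
    return sum(x) / len(x)


def max_window_mean(x, n):
    """Largest average of n consecutive terms, over all windows inside x."""
    prefix = [0]
    for value in x:
        prefix.append(prefix[-1] + value)
    return max(prefix[j + n] - prefix[j] for j in range(len(x) - n + 1)) / n


# x_k = 1 on the blocks [2^b, 2^b + b), b = 1, ..., 15, and 0 elsewhere.
N = 2 ** 16
x = [0] * N
for b in range(1, 16):
    for k in range(2 ** b, 2 ** b + b):
        x[k] = 1

plain = cesaro_mean(x)                                    # 120 ones among 65536 terms
windowed = [max_window_mean(x, n) for n in (5, 10, 15)]   # each window hits a full block
```

Here the Cesàro mean is already below 0.002, while every window length that fits into one of the blocks still has a window of average 1. Continued indefinitely, such a sequence has Cesàro averages tending to 0 but $M\{x_n\} = 1$, which is why $M\{\Lambda^n\varphi(X_i)\} = 0$ says more than (7).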

Now for every $F \in \mathcal{F}$ let $\Gamma_F$ be the operator in $\Phi$ defined by

(18)  $\Gamma_F\psi(A) = \psi(F \cap A), \quad A \in \mathcal{F}.$

The next lemma will be used in the proof of the invariance of $P$.

Lemma 2. Let $0 \le \varphi \in \Phi$, $\varphi(A^c) = 0$, and $\Lambda\varphi(E) > 0$ for some set $E$. Then
there exists an $\varepsilon > 0$ and a set $B \subset A$ with $\mu(B) > 0$ such that

(19)  $\Lambda\psi(E) \ge \varepsilon\psi(B)$

for all $0 \le \psi \in \Phi$.

Proof. Let $g = d\varphi/d\mu$. Then $A' = \{x: g(x) > 0\}$ is the smallest carrier of
$\varphi$. We may assume that $A = A'$. Define $\eta(F) = \Lambda\Gamma_F\varphi(E)$. It is easy to show
that $\eta \in \Phi$ and $\eta \ll \varphi$. If $f = d\eta/d\varphi$, it follows from $\eta(X) = \Lambda\varphi(E) > 0$ and
$\eta \ll \varphi \ll \mu$ that $\mu\{x: f(x) > 0\} > 0$; and, hence, that for some $\varepsilon > 0$, $B =
\{x: f(x) > \varepsilon\}$ has positive measure. Changing $f$ on a $\varphi$-null set, we may assume
that $B \subset A$.

If $0 \le \psi \in \Phi$ is carried by $A$, then $A = A'$ implies $\psi \ll \varphi$. To prove (19), we
first assume $d\psi/d\varphi = \chi_H$ for some $H \subset A$. Then

(20)  $\Lambda\psi(E) = \Lambda\Gamma_H\varphi(E) = \eta(H) = \int_H f\,d\varphi \ge \varepsilon \int_{H \cap B} d\varphi = \varepsilon\psi(B).$

The usual extension procedures yield (19) for arbitrary $0 \le \psi \ll \varphi$, equivalently
for every $\Gamma_A\psi$, ($0 \le \psi \in \Phi$). If $0 \le \psi \in \Phi$ is arbitrary, then the inequalities

(21)  $\Lambda\psi(E) \ge \Lambda\Gamma_A\psi(E) \ge \varepsilon\Gamma_A\psi(B) = \varepsilon\psi(B)$

complete the proof.

We proceed to prove the invariance of $P$. Assume that $P$ is not invariant. Then
there exists some nonnegative $\varphi \in \Phi$ with $\varphi(N) = 0$ and $\Lambda\varphi(N) > 0$. Hence,
there is an index $i_0$ with $\Lambda\varphi(X_{i_0}) > 0$. Apply lemma 2 with $A = P$ and $X_{i_0} = E$.
From (9) we obtain

(22)  $M\{\Lambda^n\mu(X_{i_0})\} = M\{\Lambda^{n+1}\mu(X_{i_0})\} \ge \varepsilon M\{\Lambda^n\mu(B)\} > 0,$

which contradicts (10).

Let us now consider the influence of the null part. We define inductively for
every $\varphi \in \Phi$,

(23)  $\varphi_0 = \Gamma_P\varphi, \quad \varphi_0^* = \Gamma_N\varphi,$
      $\varphi_{k+1} = \Gamma_P\Lambda\varphi_k^*, \quad \varphi_{k+1}^* = \Gamma_N\Lambda\varphi_k^*.$

Then

(24)  $\Lambda^n\varphi = \Lambda^n\varphi_0 + \cdots + \Lambda\varphi_{n-1} + \varphi_n + \varphi_n^*$

follows by induction and $\|\varphi_{k+1}\| + \|\varphi_{k+1}^*\| \le \|\varphi_k^*\|$ implies $\sum_{k=0}^{\infty} \|\varphi_k\| \le \|\varphi\|$.
Therefore, $\Psi_P\varphi = \sum_{k=0}^{\infty} \Gamma_P(\Lambda\Gamma_N)^k\varphi$ defines a contraction.

Lemma 3. For every $B \subset P$ we have $M\{\Lambda^n\mu(B)\} = M\{\Lambda^n\Psi_P\mu(B)\}$.

Proof. Define $\mu_k$, $\mu_k^*$ by (23). First note that $B \subset P$ implies $\mu_n^*(B) = 0$ for
every $n$ since $P$ is invariant. For every $\varepsilon > 0$ we may choose $n_0$ so large that
$\sum_{k=n_0+1}^{\infty} \|\mu_k\| < \varepsilon$. Then from (13) and (24) we derive the inequality

(25)  $|M\{\Lambda^{n_0+n}\mu(B)\} - M\{(\Lambda^{n_0+n}\mu_0 + \cdots + \Lambda^n\mu_{n_0})(B)\}| \le \varepsilon.$

Equation (14) implies both $M\{\Lambda^{n_0+n}\mu(B)\} = M\{\Lambda^n\mu(B)\}$ and

(26)  $M\{(\Lambda^{n_0+n}\mu_0 + \cdots + \Lambda^n\mu_{n_0})(B)\} = M\{\Lambda^{n_0+n}(\mu_0 + \cdots + \mu_{n_0})(B)\}$
      $= M\{\Lambda^n(\mu_0 + \cdots + \mu_{n_0})(B)\}.$

Therefore, $M\{\Lambda^n(\mu_0 + \cdots + \mu_{n_0})(B)\}$ tends (for $n_0 \to \infty$) to $M\{\Lambda^n\mu(B)\}$ and to

(27)  $M\bigl\{\Lambda^n\bigl(\sum_{k=0}^{\infty} \mu_k\bigr)(B)\bigr\} = M\{\Lambda^n\Psi_P\mu(B)\}.$

The essential step in the proof of theorem 1 is an application of theorem
A. Observe that $P$ is invariant and that $\Psi_P\mu$ is equivalent to $\Gamma_P\mu$, that is,
$\Gamma_P\mu \ll \Psi_P\mu \ll \Gamma_P\mu$. If $\mu(P) = 0$, theorem 1 is now trivial. If $\mu(P) > 0$, we may
assume that $\Psi_P\mu(P) = 1$ by normalizing the measure. Lemma 3 says that $\Lambda$
satisfies condition (ii) of theorem A applied to $(P, \mathcal{F} \cap P, \Psi_P\mu, \Lambda)$. Hence, there
exists an invariant measure $\bar\varphi$ on $P$ which is equivalent to $\Psi_P\mu$, and therefore
equivalent to $\Gamma_P\mu$. Set $\bar\varphi(N) = 0$. Then $\bar f = d\bar\varphi/d\mu$ is invariant, strictly positive
in $P$ and 0 in $N$, which proves (ii). That $P \subset C$ follows from (ii) and theorem B,
since $\sum_{k=0}^{\infty} T^k\bar f = \sum_{k=0}^{\infty} \bar f$ diverges in $P$. The proof of (iii) rests on (ii), theorem C,
theorem D, and the following lemma.

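Lemma 4. For every $f \in L_1$ the sequence $\{\chi_P T^k f,\ k = 0, 1, \cdots\}$ is uniformly
integrable.

Proof. For $g \in L_1$ put $T_P g = \chi_P Tg$ and $T_N g = \chi_N Tg$. We define for $f \in L_1$
the sequences $\{f_j\}$ and $\{f_j^*\}$ by

(28)  $f_0 = \chi_P f, \quad f_0^* = \chi_N f,$
      $f_{k+1} = T_P f_k^*, \quad f_{k+1}^* = T_N f_k^*.$

Then $\|f_{k+1}\| + \|f_{k+1}^*\| \le \|f_k^*\|$ implies $\sum_{j=0}^{\infty} \|f_j\| \le \|f\|$, and for $j \ge 1$ we have
$f_j = T_P T_N^{j-1}(\chi_N f)$. From the invariance of $P$ we conclude that $\chi_N T^n g = T_N^n g$ and
$T^n(\chi_P g) = T_P^n(\chi_P g)$ for all $g \in L_1$. It is now intuitively clear and also follows by
induction that

(29)  $\chi_P T^k f = \chi_P \sum_{j=0}^{k} T^j f_{k-j}.$

To prove uniform integrability of $\{\chi_P T^k f\}$, we may and do assume $f \ge 0$. For
a given $\varepsilon > 0$, choose $\ell = \ell_\varepsilon$ so large that

(30)  $\sum_{j=\ell+1}^{\infty} \|f_j\| < \varepsilon/6.$

Let $h = \chi_P T^\ell f$, and let $0 \le \bar f \in L_1$ be invariant and strictly positive on $P$. We
choose $m$ so large that

(31)  $\int (h - m\bar f)^+\, d\mu < \varepsilon/6,$

where $g^+ = \max\{g, 0\}$. Finally, $a_\varepsilon > 0$ may be chosen so large that

(32)  $\int_{\{m\bar f > a_\varepsilon\}} m\bar f\, d\mu < \varepsilon/6.$

It follows from (30) and (31) that for any $k \ge \ell$,

(33)  $A_k = \{x: T^{k-\ell}(h - m\bar f)^+ + T^{k-\ell-1} f_{\ell+1} + \cdots + T^0 f_k > a_\varepsilon\}$

has measure $\mu(A_k) \le a_\varepsilon^{-1} \cdot \varepsilon/3$. The inequality

(34)  $\chi_P T^k f \le m\bar f + T^{k-\ell}(h - m\bar f)^+ + T^{k-\ell-1} f_{\ell+1} + \cdots + T^0 f_k$

implies

(35)  $P \cap \{T^k f > 2a_\varepsilon\} \subset \{m\bar f > a_\varepsilon\} \cup \bigl(\{m\bar f \le a_\varepsilon\} \cap A_k\bigr).$

Therefore,

(36)  $\int_{\{T^k f > 2a_\varepsilon\}} \chi_P T^k f\, d\mu \le \int_{\{m\bar f > a_\varepsilon\}} m\bar f\, d\mu + \int (h - m\bar f)^+\, d\mu + \sum_{j=\ell+1}^{\infty} \|f_j\| + \int_{A_k} a_\varepsilon\, d\mu < \varepsilon.$

This establishes the inequality

(37)  $\int_{\{T^k f > c_\varepsilon\}} \chi_P T^k f\, d\mu < \varepsilon$

for $c_\varepsilon \ge 2a_\varepsilon$ and all $k \ge \ell$ so that, for $c_\varepsilon$ large enough, (37) holds for all $k \ge 0$.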

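For the proof of (iii) let $\varphi \in \Phi$ and $f = d\varphi/d\mu$. Theorem C applied to $f$ and to
$p = \bar f$ states

(38)  $\lim_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} T^k f = h(f, \bar f) \cdot \bar f$

in $P$. Since $\bar f = \chi_P \bar f$ we have $R_C \bar f = \bar f$. By theorem D

(39)  $\chi_P h(f, \bar f) \cdot \bar f = \chi_P \bar f\, E(R_C f \mid \mathfrak{I}) / E(\bar f \mid \mathfrak{I}) \in L_1.$

Lemma 4 implies that $\{\chi_P n^{-1} \sum_{k=0}^{n-1} T^k f\}$ is uniformly integrable; hence,
$\chi_P n^{-1} \sum_{k=0}^{n-1} T^k f$ tends to $\chi_P h(f, \bar f) \cdot \bar f$ in norm. This limit is shown to be
invariant by a cancellation argument. Norm convergence implies convergence of
$n^{-1} \sum_{k=0}^{n-1} \Lambda^k\varphi(A \cap P)$ for any $A \in \mathcal{F}$ since $\Lambda^k\varphi(A \cap P) = \int_{A \cap P} T^k f\, d\mu$.
The theorem is thereby completely proved.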
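
A small finite chain shows the kind of convergence asserted in (5) and (7). This is only a sketch under assumptions that are not in the paper: the matrix `P`, the uniform initial measure `mu`, and the names `step` and `cesaro_averages` are hypothetical. State 0 is transient and so plays the role of the null part here, while the periodic class {1, 2} plays the role of the positive part.

```python
# Hypothetical chain: state 0 leaks into the 2-cycle {1, 2}.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
mu = [1 / 3, 1 / 3, 1 / 3]


def step(phi):
    """One application of Lambda: phi -> phi P."""
    n = len(phi)
    return [sum(phi[x] * P[x][y] for x in range(n)) for y in range(n)]


def cesaro_averages(phi, n):
    """Return n^{-1} * sum_{k<n} Lambda^k phi, evaluated at every state."""
    total = [0.0] * len(phi)
    current = list(phi)
    for _ in range(n):
        total = [t + c for t, c in zip(total, current)]
        current = step(current)
    return [t / n for t in total]


avg = cesaro_averages(mu, 4000)
```

In this example the mass at state 1 keeps oscillating between values near 4/9 and 5/9, so $\Lambda^k\mu(\{1\})$ itself does not converge; the Cesàro averages nevertheless settle near 1/2 on each of the states 1 and 2 and near 0 on state 0.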

Remark. (1) Dean and Sucheston ([6], theorem 3) have shown the following:
for $\varphi = \mu$ and $P = X$, $n^{-1} \sum_{k=0}^{n-1} \Lambda^{k+i}\varphi(A \cap P)$ converges uniformly in $i$.
We mention that this remains true for general $\varphi \in \Phi$ and $P$ as may be shown by
extending their method to the present case. The application of their proposition
3 must then be replaced by an application of lemma 4 of this paper.

(2) Let $f_0 \in L_1$ be strictly positive. Neveu [20] mentions that $P$ is the
intersection of all sets $\{\sum_{i=0}^{\infty} T^{n_i} f_0 = +\infty\}$ where $\{n_i\}$ runs through all subsequences
of the nonnegative integers. Another characterization of $P$ in terms of $T$ is
given by proposition 1.

Proposition 1. Define $P_f$ for $0 \le f \in L_1$ by

(40)  $P_f = \bigl\{x: \liminf_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} T^k f(x) > 0\bigr\}.$

Then $P = P_f$ holds for all $f \in L_1$ which are strictly positive in $X$.

Proof. Theorem C implies that $P_f$ does not depend on the choice of a
strictly positive $f \in L_1$. Taking $f = \bar f + \chi_N$ with $\bar f = T\bar f$ strictly positive in $P$,
we obtain $P_f \supset P$. Next take $f = 1$. For every $X_i$,

(41)  $0 \le \int_{X_i} \bigl(\liminf_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} T^k f\bigr)\, d\mu \le \liminf_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} \int_{X_i} T^k f\, d\mu$
      $= \liminf_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} \Lambda^k\mu(X_i) \le M\{\Lambda^n\mu(X_i)\} = 0;$

hence, all $X_i$ belong to the complement of $P_f$.

Property (iv) of theorem 1 might lead one to expect the convergence to 0 of
(7) even with $X_i$ replaced by $N$. However, in that case the Cesàro averages of
$\Lambda^n\mu(N)$ will usually decrease to a positive lower bound, and for appropriate
$A \subset N$ the numbers $n^{-1} \sum_{k=0}^{n-1} \Lambda^k\mu(A)$ may oscillate. This observation is due to
Mrs. Dowker ([7], theorem 3), who considered ergodic point mappings. We
mention the following generalization of her result.

FIFTH BERKELEY SYMPOSIUM: KRENGEL

Proposition 2. If $\mu$ is nonatomic (that is, every $A$ with $\mu(A) > 0$ contains a
$B$ with $0 < \mu(B) < \mu(A)$) and $P = \emptyset$, and $\|\Lambda^k\mu\| = 1$ for all $k$, then for any given
$\alpha$, $\beta$ with $0 \le \alpha \le \beta \le 1$, there exists a set $A_{\alpha,\beta} \in \mathcal{F}$ with

(42)  $\limsup_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} \Lambda^k\mu(A_{\alpha,\beta}) = \beta,$
      $\liminf_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} \Lambda^k\mu(A_{\alpha,\beta}) = \alpha.$

Proof. Clearly, the measures $\mu_n = n^{-1} \sum_{k=0}^{n-1} \Lambda^k\mu$ are equivalent to $\mu$. By
theorem 1 (iv) the following lemma of Mrs. Dowker is applicable.

Lemma A. Let $X$ be the disjoint union of countably many sets $X_i$, and let $\mu$ be
nonatomic and $\{\mu_n\}$ a sequence of normalized measures which are equivalent to $\mu$
and such that $\lim_{n\to\infty} \mu_n(X_i) = 0$ for every $X_i$. Then for any given $\alpha$, $\beta$ with
$0 \le \alpha \le \beta \le 1$ there exists a set $A_{\alpha,\beta}$ with (42).

4.  Weighted  conservative  parts 

We now introduce "weighted conservative parts $C_w$" by considering
expressions $\sum_{k=0}^{\infty} w_k T^k f$ with monotonically decreasing weights:

(43)  $w = \{w_k\}, \quad k = 0, 1, \cdots; \quad w_k \ge w_{k+1} > 0.$

The result will be a finer splitting of the conservative null part $C \cap N$ (see
example 1). In the elementary special case of Markov chains we may consider
this splitting as a classification of the null-recurrent states according to the
speed of convergence to 0 of the "probabilities of return at time $n$." This
classification does not yield nearly as precise statements about $C \cap N$ as some results
of Vere-Jones [24] and Kingman [19] do about $D$. It seems, however, to be the
first suggestion of any method for further classification of $C \cap N$ and might be
of interest as a common generalization of both the decompositions $X = C + D$
and $X = P + N$.

Ergodic  theorems  with  weighted  averages  were  first  introduced  by  Baxter  [1], 
who  used  recurrence  probabilities  as  weights.  Jamison,  Orey,  and  Pruitt  [18] 
showed  that  far  more  general  weights  may  be  used  for  the  summation  of  inde¬ 
pendent  identically  distributed  random  variables. 

We first state two elementary lemmas which make it possible to replace
Cesàro means by weighted averages with monotonically decreasing weights in
practically all theorems of pointwise ergodic theory.

Lemma 5. Let $\{f_k\}$, $\{p_k\}$, $\{w_k\}$ be three sequences of real numbers ($k = 0, 1, \cdots$)
with $p_k \ge 0$, $\sum_{k=0}^{\infty} p_k > 0$, and (43). If

(44)  $\sum_{k=0}^{n-1} f_k \Big/ \sum_{k=0}^{n-1} p_k$

converges to a finite limit, then so does

(45)  $\sum_{k=0}^{n-1} w_k f_k \Big/ \sum_{k=0}^{n-1} w_k p_k.$

In the case $\sum_{k=0}^{\infty} w_k p_k = \infty$, the limit is the same.

(The special case where all $p_k > 0$ and $\sum w_k p_k = \infty$ is equivalent to a known
theorem of the theory of summability; see, for example, Hardy ([11], p. 309).)

Proof. Put $s_n = \sum_{k=0}^{n} f_k$, $r_n = \sum_{k=0}^{n} p_k$, $s_{w,n} = \sum_{k=0}^{n} w_k f_k$, and $r_{w,n} = \sum_{k=0}^{n} w_k p_k$.
By a translation $f_k \to f_k - \lambda p_k$ we may and do assume the limit of (44) to be 0.
Furthermore, it is sufficient to assume $p_0 > 0$. We investigate the ratios

(46)  $\dfrac{s_{w,n}}{r_{w,n}} = \dfrac{\sum_{k=0}^{n-1} s_k(w_k - w_{k+1}) + s_n w_n}{\sum_{k=0}^{n-1} r_k(w_k - w_{k+1}) + r_n w_n},$

which are obtained by means of the Abel transformation. The ratios $b_k = s_k r_k^{-1}$
tend to 0. Put $s_k = b_k r_k$ in (46). If the denominator remains bounded, then $r_n w_n$
is bounded and $\sum_{k=0}^{\infty} r_k(w_k - w_{k+1}) < \infty$. In that case, $\sum_{k=0}^{\infty} b_k r_k(w_k - w_{k+1})$
converges and $s_n w_n$ tends to 0. If the denominator tends to infinity and $\varepsilon > 0$
is given, choose $K$ such that $|b_n| < \varepsilon/2$ for $n \ge K$, and then choose $L \ge K$ such
that $|r_{w,L}^{-1} \sum_{k=0}^{K} s_k(w_k - w_{k+1})| < \varepsilon/2$. Then $|s_{w,n}/r_{w,n}| < \varepsilon$ for all $n \ge L$.
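
Lemma 5 can be checked numerically on a simple example. The sketch below is an illustration only; the particular sequences and the function name `ratios` are assumptions and do not come from the paper. It uses $p_k = 1$, an oscillating $f_k$ whose partial-sum ratio (44) tends to 2, and the decreasing weights $w_k = (k+1)^{-1/2}$, for which $\sum w_k p_k$ diverges.

```python
def ratios(f, p, w):
    """Return (sum f_k / sum p_k, sum w_k f_k / sum w_k p_k) over the given range."""
    s = r = sw = rw = 0.0
    for fk, pk, wk in zip(f, p, w):
        s += fk
        r += pk
        sw += wk * fk
        rw += wk * pk
    return s / r, sw / rw


n = 200_000
f = [2.0 + (-1) ** k for k in range(n)]       # values 3, 1, 3, 1, ...
p = [1.0] * n
w = [(k + 1) ** -0.5 for k in range(n)]       # decreasing, sum of w_k p_k diverges

plain, weighted = ratios(f, p, w)
```

Both ratios are close to 2, as the lemma predicts for divergent $\sum w_k p_k$. The weighted ratio converges more slowly, since its denominator grows only like $2\sqrt{n}$.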

While  lemma  5  carries  over  the  convergence  theorems  and  the  theorems  on 
identification  of  the  limit  to  the  case  of  monotone  averages,  we  need  a  second 
lemma  for  the  proof  of  a  maximal  ergodic  theorem  and  a  dominated  ergodic 
theorem. 

Lemma 6. If $x_0, \cdots, x_n$ are any real numbers, $y_0 > 0$, $y_1, \cdots, y_n \ge 0$, and
$0 < w_0 \ge w_1 \ge \cdots \ge w_n \ge 0$, then

(47)  $\max_{0 \le k \le n} \dfrac{x_0 + \cdots + x_k}{y_0 + \cdots + y_k} \ge \max_{0 \le k \le n} \dfrac{w_0 x_0 + \cdots + w_k x_k}{w_0 y_0 + \cdots + w_k y_k}.$

Proof. We may assume $w_k > 0$, ($k = 0, \cdots, n$). Let $\lambda$ be the expression on
the right-hand side of (47), and let $k$ be the first index for which the quotient
equals $\lambda$. Then

(48)  $w_j x_j + \cdots + w_k x_k \ge \lambda(w_j y_j + \cdots + w_k y_k)$

for all $j$ with $0 \le j \le k$. Find $\alpha_0, \cdots, \alpha_k$ successively by solving the equations
$w_j \sum_{i=0}^{j} \alpha_i = 1$ for $j = 0, \cdots, k$. The monotonicity of the $w_j$ implies $\alpha_i \ge 0$,
and then (48) implies

(49)  $\alpha_0(w_0 x_0 + w_1 x_1 + \cdots + w_k x_k) \ge \lambda\alpha_0(w_0 y_0 + w_1 y_1 + \cdots + w_k y_k),$
      $\alpha_1(w_1 x_1 + \cdots + w_k x_k) \ge \lambda\alpha_1(w_1 y_1 + \cdots + w_k y_k),$
      $\cdots$
      $\alpha_k w_k x_k \ge \lambda\alpha_k w_k y_k.$

Adding these inequalities we obtain $(x_0 + \cdots + x_k) \ge \lambda(y_0 + \cdots + y_k)$.
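
Inequality (47) is easy to test on random data. The following sketch is a numerical spot check, not a proof; the helper names `max_ratio` and `check_lemma6`, the sampling ranges, and the tolerance `1e-9` are all assumptions made for the illustration.

```python
import random


def max_ratio(x, y, w):
    """max over k of (w_0 x_0 + ... + w_k x_k) / (w_0 y_0 + ... + w_k y_k)."""
    best = float("-inf")
    num = den = 0.0
    for xk, yk, wk in zip(x, y, w):
        num += wk * xk
        den += wk * yk
        best = max(best, num / den)
    return best


def check_lemma6(trials=1000, n=8, seed=0):
    """Return True if (47) holds, up to rounding, on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        # y_0 > 0 and the remaining y_k >= 0, as in the hypothesis of the lemma.
        y = [rng.uniform(0.1, 1)] + [rng.uniform(0, 1) for _ in range(n - 1)]
        # Positive weights sorted into decreasing order.
        w = sorted((rng.uniform(0.01, 1) for _ in range(n)), reverse=True)
        unweighted = max_ratio(x, y, [1.0] * n)
        weighted = max_ratio(x, y, w)
        if unweighted < weighted - 1e-9:
            return False
    return True


ok = check_lemma6()
```

Replacing the sorted weights by increasing ones is a quick way to see that the monotonicity assumption matters: the check can then fail.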
Remark. We mention another inequality which may easily be derived from
(47). Let $f_k$ be real numbers ($k = 0, 1, 2, \cdots$) such that $\sum_{k=0}^{\infty} \lambda^k f_k$ converges for
every $\lambda$ with $0 < \lambda < 1$, and let $p_k$ be nonnegative and $p_0 > 0$. Then

(50)  $\sup_{0 < \lambda < 1} \dfrac{\sum_{k=0}^{\infty} \lambda^k f_k}{\sum_{k=0}^{\infty} \lambda^k p_k} \le \sup_{0 \le n < \infty} \dfrac{\sum_{k=0}^{n} f_k}{\sum_{k=0}^{n} p_k}.$

Apply lemma 6 to the classical dominated ergodic theorem (see, for instance,
Jacobs [17]) and observe that it is sufficient to prove this theorem for $f \ge 0$ and
positive operators $T$. Then a dominated ergodic theorem with monotone weights
follows.

Next let us apply lemma 6 to Hopf's maximal ergodic theorem [12]. Put for
$f \in L_1$

(51)  $E_n = \bigl\{x \in X: \max_{0 \le k \le n} \sum_{i=0}^{k} T^i f(x) > 0\bigr\},$

(52)  $E_{w,n} = \bigl\{x \in X: \max_{0 \le k \le n} \sum_{i=0}^{k} w_i T^i f(x) > 0\bigr\}.$

Hopf's theorem states that

(53)  $\int_{E_n} f\, d\mu \ge 0$

for any $f \in L_1$ and $n \ge 1$. Applying lemma 6 we obtain $E_{w,n} \subset E_n$. Since
$\{x: f(x) > 0\} \subset E_{w,n}$, (53) now implies $\int_{E_{w,n}} f\, d\mu \ge 0$. This is the desired
maximal ergodic theorem with monotone weights. We mention that Garcia [8] has
presented a very short proof of (53).

Rota’s  [21]  basic  lemma  as  well  as  his  dominated  ergodic  theorem  of  the 
Abel  type,  follow  in  a  similar  way  from  (50).  A  generalization  of  the  Riesz 
lemma  may  be  obtained  by  applying  the  idea  of  the  proof  of  lemma  6.  An 
ergodic  theorem  for  continuous  flows  and  monotone  weights  wt  (t  >  0)  may  be 
proved  by  using  lemma  5  and  an  extension  of  the  usual  method  applied  to 
wt  =  1  (see,  for  example,  Jacobs  [17]). 

The  main  result  of  this  section  needs  for  its  proof  only  the  maximal  ergodic 
theorem  with  monotone  weights.  However,  as  we  assume  knowledge  of  theorem 
C  in  this  paper,  the  easiest  derivation  uses  lemma  5. 

Take $w = \{w_k\}$ as in (43) and define the operators $S_{w,n}$ in $L_1$ by

(54)  $S_{w,n} g = \sum_{k=0}^{n-1} w_k T^k g.$

From theorem C and lemma 5 we infer that for $f, p \in L_1$, $p \ge 0$ the ratios

(55)  $Q_{w,n}(f, p) = S_{w,n} f / S_{w,n} p$

converge to a finite limit on $\{x: \sum_{k=0}^{\infty} T^k p(x) > 0\}$. If $p_1, p_2 \in L_1$ are strictly
positive in $X$, we derive from (55) with $f = p_1$, $p = p_2$ and with $f = p_2$, $p = p_1$
that $C_{w,i} = \{x: \sum_{k=0}^{\infty} w_k T^k p_i = \infty\}$ is independent of $i$. We define the
w-conservative part $C_w$ of $X$ (with respect to $T$) to be $C_{w,i}$. The set $D_w = X - C_w$
is called the w-dissipative part of $X$ (with respect to $T$). We now generalize
theorem B.

Theorem 2. If $T$ is a positive contraction in $L_1$, then $X$ is the disjoint union of
two uniquely determined sets $C_w$ and $D_w$ such that

(i) for every $f \ge 0$, $\sum_{k=0}^{\infty} w_k T^k f$ converges in $D_w$;

(ii) for every $f \ge 0$, $\sum_{k=0}^{\infty} w_k T^k f$ diverges in $C_w \cap \{x: \sum_{k=0}^{\infty} T^k f > 0\}$;

(iii) $C_w$ is invariant.

Proof. Properties (i) and (ii) are immediate by the same kind of argument
which proved that $C_{w,1}$ equals $C_{w,2}$.

To prove (iii) we first remark that for any integrable $g$ the inequality
$Tg^+ \ge (Tg)^+$ holds since $T$ preserves order. If $p = p_1$, $p^{w,k} = \sum_{i=0}^{k} w_i T^i p$, and
$\bar p = \sum_{i=0}^{\infty} (w_i - w_{i+1}) T^{i+1} p$, then $\bar p$ is integrable; hence $Tp^{w,k} \le p^{w,\infty} + \bar p$ is
bounded in $D_w$.

If $C_w$ is not invariant, then there exists an integrable $f \ge 0$ such that $f = 0$
in $D_w$ and $Tf$ is positive in a subset of $D_w$ of positive measure. Then there is
some $n \ge 1$ with

(56)  $\|(nTf - p^{w,\infty} - \bar p)^+\| = a > 0.$

Since $p^{w,k} \uparrow \infty$ in $C_w$,

(57)  $\|(nf - p^{w,k})^+\| < a/2$

for sufficiently large $k$. The estimate

(58)  $\|(nTf - p^{w,\infty} - \bar p)^+\| \le \|(nTf - Tp^{w,k})^+\| \le \|T(nf - p^{w,k})^+\| \le \|(nf - p^{w,k})^+\|$

makes it evident that (57) contradicts (56). (This proof uses the same idea as
known proofs of the special case $w_k = 1$, but avoids unnecessary complications.)

We now describe $C_w$ and $D_w$ in terms of $\Lambda$ and $\Phi$, since $\Phi$ is the space of most
interest if $\Lambda$ is given by a stochastic kernel. Such a description will also be useful
in the proof of theorem 3 and it seems to make the relation to the concept of
recurrent states in the theory of Markov chains a bit more transparent. It is,
however, equivalent to theorem 2.

Theorem 2*. If $\Lambda$ is a positive contraction in $\Phi$, then $X$ is the disjoint union of
two uniquely determined sets $C_w$ and $D_w$ such that

(i) for every $\varphi \in \Phi$ with $\varphi \ge 0$, $D_w$ is a countable union of sets $X_\nu$ with the
property $\sum_{k=0}^{\infty} w_k \Lambda^k\varphi(X_\nu) < \infty$;

(ii) for every $0 \le \varphi \in \Phi$ and any $A \subset C_w$, the series $\sum_{k=0}^{\infty} w_k \Lambda^k\varphi(A)$ diverges
or is 0;

(iii) $C_w$ is invariant.

Proof. (i) Let $f = d\varphi/d\mu$ and $X_\nu = \{x: \nu \le \sum_{k=0}^{\infty} w_k T^k f(x) < \nu + 1\}$, $\nu =
0, 1, \cdots$. Then (i) follows from $\Lambda^k\varphi(B) = \int_B T^k f\, d\mu$, and (ii) follows from the
same relation, since $\sum_{k=0}^{\infty} w_k T^k f(x)$ diverges in $C_w \cap \{\sum_{k=0}^{\infty} T^k f(x) > 0\}$.

It would be of interest to know whether the decomposition $\{X_\nu\}$ of $D_w$ (like
that of $N$) may be given for all $\varphi \in \Phi$ simultaneously.

Let us now show that for any $\Lambda$ the positive part $P$ equals $C_w$ for some
$w = \{w_k\}$.

Lemma 7. Let $\{a_{m,k}\}$ ($m = 0, 1, \cdots$; $k = 0, 1, \cdots$) be an infinite matrix of
nonnegative numbers such that

(59)  $\lim_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} a_{m,k} = 0$

for any $m$. Then there exists a decreasing sequence $w = \{w_k\}$, ($k = 0, 1, \cdots$) of
positive numbers such that $\sum_{k=0}^{\infty} w_k$ diverges and $\sum_{k=0}^{\infty} w_k a_{m,k}$ converges for every $m$.

Proof. Put $n_0 = 0$, and choose $n_1 \ge 1$ such that $n \ge n_1$ implies
$n^{-1} \sum_{k=0}^{n-1} a_{0,k} < 2^{-1}$. If $n_1 < n_2 < \cdots < n_i$ are chosen, we next choose $n_{i+1} > n_i$
so large that $n_{i+1} - n_i \ge n_i - n_{i-1}$ and such that for $n \ge n_{i+1}$ and $m = 0, \cdots, i$,

(60)  $(n - n_i)^{-1} \sum_{k=n_i}^{n-1} a_{m,k} < 2^{-(i+1)}.$

We put

(61)  $w_0 = \cdots = w_{n_1 - 1} = n_1^{-1}, \ \cdots, \ w_{n_i} = \cdots = w_{n_{i+1} - 1} = (n_{i+1} - n_i)^{-1}.$

Then $\{w_k\}$ decreases, $\sum_{k=0}^{\infty} w_k = \infty$ since any weight $(n_{i+1} - n_i)^{-1}$ occurs
$(n_{i+1} - n_i)$ times, and we have

(62)  $\sum_{k=n_i}^{n_{i+1}-1} w_k a_{m,k} = (n_{i+1} - n_i)^{-1} \sum_{k=n_i}^{n_{i+1}-1} a_{m,k} < 2^{-(i+1)}$

for $m = 0, \cdots, i$. Therefore $\sum_{k=0}^{\infty} w_k a_{m,k}$ converges for every $m$.
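
The block construction in this proof can be carried out explicitly in a simplified setting. The sketch below is an illustration under assumptions that are not in the paper: it treats a single row instead of a matrix, it takes that row to be nonincreasing (so that a block average can only decrease as the block grows and condition (60) need only be tested at the chosen endpoint), and it stops after finitely many blocks. The function name `build_block_weights` and the test sequence $a_k = (k+1)^{-1/2}$ are hypothetical.

```python
def build_block_weights(a, num_blocks, limit=10 ** 6):
    """Block weights for a single nonincreasing sequence a(k), finitely many blocks.

    Returns the endpoints n_0 < n_1 < ... and the weights w_k, which are
    constant and equal to 1 / (block length) on each block.
    """
    ends = [0]                      # n_0 = 0
    prev_len = 1
    for i in range(num_blocks):
        start, n, total = ends[-1], ends[-1], 0.0
        while True:
            total += a(n)
            n += 1
            length = n - start
            # Block must not shrink, and its average must be below 2^-(i+1).
            if length >= prev_len and total / length < 2.0 ** -(i + 1):
                break
            if n > limit:
                raise ValueError("limit reached; averages may not tend to 0")
        ends.append(n)
        prev_len = length
    weights = []
    for lo, hi in zip(ends, ends[1:]):
        weights.extend([1.0 / (hi - lo)] * (hi - lo))
    return ends, weights


def a(k):
    """Test sequence with Cesaro averages tending to 0."""
    return (k + 1) ** -0.5


ends, w = build_block_weights(a, num_blocks=5)
```

Each block contributes total weight 1 to $\sum w_k$, which is what makes the weight sum diverge when the construction is continued, while block $i$ contributes less than $2^{-(i+1)}$ to $\sum w_k a_k$.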

Theorem 3. The positive part $P$ of $X$ with respect to $\Lambda$ is the intersection of all
parts $C_w$ for which $\sum_{k=0}^{\infty} w_k = \infty$. More precisely, for every $\Lambda$ there exists a $w = \{w_k\}$
with (43) and $\sum_{k=0}^{\infty} w_k = \infty$ and $P = C_w$.

Proof. Let $\sum_{k=0}^{\infty} w_k = \infty$. If $0 \le \varphi \in \Phi$ is invariant and equivalent to $\Gamma_P\mu$,
then $\sum_{k=0}^{\infty} w_k \Lambda^k\varphi(A) = \sum_{k=0}^{\infty} w_k \varphi(A) = \infty$ for any $A \subset P$ with $\mu(A) > 0$.
Therefore $P \subset C_w$.

Next let $\{X_i,\ i = 1, 2, \cdots\}$ be a decomposition of $N$ such that

(63)  $\lim_{n\to\infty} n^{-1} \sum_{k=0}^{n-1} \Lambda^k\mu(X_i) = 0.$

By lemma 7 we may find a sequence $w = \{w_k\}$ for which $\sum_{k=0}^{\infty} w_k = \infty$ and
$\sum_{k=0}^{\infty} w_k \Lambda^k\mu(X_i) < \infty$ for all $i$. Theorem 2* (ii) now implies $\mu(X_i \cap C_w) = 0$.
Hence, $N \cap C_w = \emptyset$, or equivalently, $P \supset C_w$.

Corollary 1. The following condition is necessary and sufficient for the
existence of a finite invariant measure $\varphi$ with $\mu \ll \varphi \ll \mu$: $\sum_{k=0}^{\infty} w_k T^k \chi_X$ diverges in $X$
for every $\{w_k\}$ with (43) and $\sum_{k=0}^{\infty} w_k = \infty$.

The  next  corollary  may  replace  the  ergodic  theorem  in  those  cases  when  the 
ergodic  theorem  does  not  hold.  That  occurs  quite  frequently  according  to  the 
following  result  of  A.  Ionescu  Tulcea  [14] :  among  all  positive,  linear,  invertible 
isometries  T  of  the  space  Li(0,  1)  of  Lebesgue-integrable  functions  on  (0,  1), 
the  operators  T  which  do  not  satisfy  the  pointwise  ergodic  theorem  form  a  set 
of  the  second  category  in  the  strong  operator  topology. 
Corollary 2. For any positive contraction $T$ in $L_1$ there exists a decreasing
sequence $w = \{w_k\}$, with $\sum_{k=0}^{\infty} w_k = \infty$, such that

(64)  $\lim_{n\to\infty} \bigl(\sum_{k=0}^{n-1} w_k T^k f\bigr) \Big/ \bigl(\sum_{k=0}^{n-1} w_k\bigr)$

exists a.e. for any $f \in L_1$. The limit is invariant, vanishes in $N$, and is given by
(39) in $P$. In particular it is strictly positive in $P \cap \{\sum_{k=0}^{\infty} T^k f > 0\}$ for $f \ge 0$.

Proof. Choose $w$ with $P = C_w$. Then $\sum_{k=0}^{\infty} w_k T^k f$ converges in $N$. The other
statements follow from lemma 5 and the considerations at the end of the proof
of theorem 1.

Remark. The assumption $\mu(X) = 1$ is not essential in this paper; $\mu$ may be
$\sigma$-finite, since $\Phi$ depends only on the null sets of $\mu$. Without this assumption,
however, the formulation of theorem D and its applications is somewhat more
complicated.


5.  The  elementary  special  case  of  Markov  chains 

We now adopt the terminology of Chung [5]. Let $I$ be a countable state space,
$(p_{ij}^{(n)})$, $i, j \in I$ the matrix of the $n$-step transition probabilities of a stationary
Markov chain, and $f_{ij}^{(n)}$ the probability that starting at $i$ the first visit to $j$ takes
place at time $n$. Let $w = \{w_k,\ k \ge 0\}$ with $w_k \ge w_{k+1} > 0$. We call the state
$i \in I$

(i) w-recurrent in the case $\sum_{k=0}^{\infty} w_k p_{ii}^{(k)} = \infty$, and

(ii) w-nonrecurrent otherwise.

In the special case $w_k = 1$ we say recurrent and nonrecurrent respectively. The
relation to the results of sections 3 and 4 becomes obvious if we put $X = I$ and
let $\mu$ be a measure which assigns positive measure to every point. Then $\Phi$ consists
of all signed finite measures on $X$. If $\Lambda$ is generated by $(p_{ij})$, then $C_w$ consists of
the w-recurrent states and $P$ consists of the positive states. In particular, in this
case, theorem 2* states the following.

The property of being w-recurrent or w-nonrecurrent is a class property.
The set $C_w$ of w-recurrent states is closed. Of course this also follows easily from
estimates of the type

(65)  $p_{jj}^{(m+k+n)} \ge p_{ji}^{(m)} p_{ii}^{(k)} p_{ij}^{(n)}$

and from $\sum_{k=0}^{\infty} (w_k - w_{k+1}) \le w_0$.

Let  us  convince  ourselves  that  the  classification  by  the  sets  Cw  may  split  the 
set  of  recurrent  null  states  into  proper  subclasses. 

Example 1. There are Markov chains such that for one state $i$ the first
return probabilities $f_{ii}^{(n)}$ $(n \ge 0)$ are an arbitrary probability distribution with
$f_{ii}^{(0)} = 0$. Simply take $i = 1$ and pass from 1 to state $k$ with probability $f_{ii}^{(k)}$; then
from state $k \ge 2$ pass through $(k - 2)$ auxiliary states associated with $k$
deterministically, finally going back to 1. The generating functions $F(s)$ and $P(s)$ of
$\{f_{ii}^{(n)}\}$ and $\{p_{ii}^{(n)}\}$ satisfy $P(s) = (1 - F(s))^{-1}$. Take $F(s) = 1 - (1 - s)^p$ for some
$p$ with $0 < p < 1$. Then

(66)  $P(s) = (1 - s)^{-p} = \Gamma(p)^{-1} \sum_{n=0}^{\infty} \Gamma(n + p)\Gamma(n + 1)^{-1} s^n$

has coefficients $p_{ii}^{(n)}$ with $p_{ii}^{(n)}\Gamma(p) n^{1-p} \to 1$ (see, for example, Titchmarsh [23],
p. 57-58). We may construct a decomposable Markov chain which has only
recurrent null states using two such chains, say with $p = \tfrac{1}{2}$ and with some $p < \tfrac{1}{2}$. However,
only the states of the part with $p = \tfrac{1}{2}$ are w-recurrent with $w = \{(1 + k)^{-1/2}\}$.
Of course one would not need monotone weights to separate the classes if the
probabilities $p_{ii}^{(n)}$ behave so regularly. The advantage of the classification $C_w$ is
its general applicability.
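
The asymptotics used in example 1 can be explored numerically. The sketch below makes several assumptions that are not in the paper: the function names are hypothetical, the second exponent 0.25 is chosen here only as an example of a value below 1/2, and divergence is of course only suggested, not proved, by finitely many terms. It computes the return probabilities once from the closed form in (66) and once from the renewal equation implied by $P(s) = (1 - F(s))^{-1}$, and then compares the growth of the weighted partial sums $\sum_k (1 + k)^{-1/2} p_{ii}^{(k)}$.

```python
import math


def return_probs_closed_form(p, n_max):
    """Coefficients Gamma(n+p) / (Gamma(p) Gamma(n+1)) of (1-s)^(-p), as in (66)."""
    return [math.exp(math.lgamma(n + p) - math.lgamma(p) - math.lgamma(n + 1))
            for n in range(n_max)]


def return_probs_renewal(p, n_max):
    """Same coefficients from the renewal equation p_n = sum_{k=1..n} f_k p_{n-k}."""
    c = [1.0]                                   # coefficients of (1-s)^p
    for k in range(1, n_max):
        c.append(c[-1] * (k - 1 - p) / k)
    f = [0.0] + [-ck for ck in c[1:]]           # F(s) = 1 - (1-s)^p
    probs = [1.0]
    for n in range(1, n_max):
        probs.append(sum(f[k] * probs[n - k] for k in range(1, n + 1)))
    return probs


def weighted_sum(p, n_max):
    """Partial sum of (1+k)^(-1/2) * p_k for k < n_max."""
    return sum((1 + k) ** -0.5 * q
               for k, q in enumerate(return_probs_closed_form(p, n_max)))


growth_half = weighted_sum(0.5, 100_000) - weighted_sum(0.5, 1_000)
growth_quarter = weighted_sum(0.25, 100_000) - weighted_sum(0.25, 1_000)
```

With exponent 1/2 the terms behave like $k^{-1}$, so the partial sums keep growing logarithmically; with exponent 0.25 they behave like $k^{-5/4}$ and the additional contribution between $10^3$ and $10^5$ terms is small.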